IAES International Journal of Artificial Intelligence (IJ-AI) 
Vol. 12, No. 3, September 2023, pp. 1419~1427 
ISSN: 2252-8938, DOI: 10.11591/ijai.v12.i3.pp1419-1427


Emotions and gesture recognition using affective computing 
assessment with deep learning 


Herjuna Artanto, Fatchul Arifin 


Department of Electronics and Informatics Engineering Education, Faculty of Engineering, Universitas Negeri Yogyakarta, Yogyakarta, Indonesia
Article Info

Article history:
Received Jun 23, 2022
Revised Sep 27, 2022
Accepted Oct 11, 2022

Keywords:
Affective assessment
Body gestures
Deep learning
Emotions
PoseNet

ABSTRACT

Emotions have an important role in education. Affective development, attitudes, and emotions in
learning are measured using affective assessment. This method is the right way to determine the
student’s affective development. However, the process did not run optimally because the teacher
found it difficult to collect student’s affective data. This paper describes the development of a
system that can assist teachers in carrying out affective assessment. The system was developed
using a v-model that aligns the verification phase with the validation. The use of the system is
carried out during learning activities. The emotion detection system detects through body gestures
using PoseNet to generate emotional data for each student. The detection results are then processed
and displayed on an information system in the form of a website for affective assessment. The
accuracy of emotion detection got validation values of 84.4% and 80.95% after being tested at
school. In addition, the acceptance test with the usability aspect of the system by the teacher got
a score of 77.56% and a score of 79.85% by the students. Based on several tests carried out, this
developed system can assist the process of implementing affective assessment.


This is an open access article under the CC BY-SA license. 


Corresponding Author:
Fatchul Arifin
Department of Electronics and Informatics Engineering Education, Faculty of Engineering, Universitas Negeri Yogyakarta
1st Colombo Street, Karangmalang Campus, Yogyakarta 55281, Indonesia
Email: fatchul@uny.ac.id


1. INTRODUCTION 

Emotion is an important aspect that can influence attitudes, decision making, and human
communication [1]. In education, emotions can affect the student’s learning process [2]. This
influence is reflected in how receptive students are when participating in learning [3]. Research
also shows that motivation affects how students respond when faced with difficult learning
materials [4]. Moreover, emotions can increase students’ environmental awareness [5]. Emotions
affect not only students: a teacher’s job satisfaction is also influenced by them [6]. Therefore, a
teacher must be able to provide an affective assessment that measures the emotions and behavior of
students to determine learning achievement [7].

Affective assessment is important because it supports a student’s cognitive processes. Morshead [8]
found that the affective domain affects the cognitive domain. However, assessing the affective
domain is more difficult than assessing the cognitive domain because it requires converting
naturally occurring feelings and attitudes into the cognitive domain [9]. In Indonesia, affective
assessment is regulated by the Ministry of Education under what are called the educational
assessment standards. Under these national standards, Indonesia implements it as an attitude
assessment using the observation method. However, in practice the assessment is dominated by the
cognitive domain [10]. This is reinforced by Rimland [11], who attributes it to the difficulty of
applying affective-domain assessment. Several factors make the affective domain difficult to
measure directly [12]. In addition, research on affective assessment in particular is still rare
because it is difficult to evaluate [10]. Therefore, research and development of affective
assessment must be able to solve these problems.

Several studies have been conducted to measure students’ affect. Text-based affective assessment
instruments can be developed using the achievement emotions questionnaire (AEQ) [13]. Moreira et
al. [14] used AEQ-Mathematics to show the relationship between achievement emotions and
personality. However, according to Pekrun [13], the self-report method may produce bias due to
student subjectivity, so it needs to be complemented with other approaches to measuring student
psychology. As technology developed, a new approach emerged, namely affective computing, which was
initiated by Picard [15]. Affective computing allows the detection of human emotions and affect
through signals obtained from facial activity, posture, gesture, hand movement, vocal, textual, and
electrodermal or skin activity [16].

Research on detecting human emotions using technology has been widely carried out. D’Mello and
Graesser [17] detected students’ boredom, confusion, and frustration through dialogue features and
body postures, using a camera while students learned with AutoTutor, as did Hussain et al. [18].
Furthermore, D’Mello and Graesser [19] used a cognitive disequilibrium model to explain the dynamic
condition of students’ emotions. Arguel et al. [20] detected confusion by capturing facial
expressions and conversational cues in interactive digital learning environments (IDLEs). Thomas
and Mathew [21] and Lyons et al. [22] identified facial expressions to detect human emotions. Ko
[23] utilized several artificial intelligence algorithms to detect emotions from visual information
(the face), while Kratzwald et al. [24] did so from text. Chen and Lee [25] detected students’
nervousness, excitement, and calmness by using a human pulse sensor. Behoora and Tucker [26] and
Sun et al. [27] detected and classified emotional states through human body gesture patterns. In
addition, Davis et al. [28] detected emotions through sound.

Other studies apply emotion detection more predominantly in learning environments, such as in
intelligent tutoring systems (ITS) [29]-[37]. These studies did not focus specifically on affective
assessment but rather on ITS performance [1], [2]. Putra and Arifin [38] developed a system to
monitor students’ mood during classroom learning so that a teacher can minimize student stress.
Detecting emotions during learning is therefore feasible. According to Nasuha et al. [39], some
basic emotions can be classified through facial expressions: anger, sadness, joy, neutrality, fear,
disgust, and surprise. In addition, emotions such as interest, joy, frustration, and boredom can be
detected through body gestures [26].

In deep learning, one model that can be used to predict body gestures is PoseNet [40], [41]. This
model detects body gestures based on 17 joint points in the human body. Because body gestures
reflect human emotions, PoseNet can be used for emotion detection: it detects the positions of the
human joints that make up a body gesture, and the pattern of joint positions is then classified
into recognizable emotions such as interested, neutral, bored, and frustrated. The classification
follows the patterns the model recognizes using a convolutional neural network (CNN) architecture.
MobileNet is one lightweight CNN architecture that is suitable for mobile use [42]-[47].

Emotions have a strong influence on students’ cognitive abilities [15]. Therefore, the learning
process must manage the affective domain well. Affective computing technology has been shown to
reveal the authentic emotional state of students. Data generated from PoseNet detection can serve
as a basis for conducting affective assessments once it is processed into information. The
resulting information system is expected to help teachers manage their classes better by paying
attention to the emotional condition of their students. The purpose of this study was to produce a
product that provides authentic student emotional data so that teachers are assisted in carrying
out affective assessment.


2. METHOD 

This research uses a research and development method to produce a product. The development
procedure follows a software development approach, namely the V-model, which is an extension of the
waterfall model. This procedure is used because it prioritizes testing activities that assist in
software development [48]. The development stages of the model can be seen in Figure 1 and include:
1) requirements analysis, 2) system design, 3) architectural design, 4) module design, 5) coding,
6) unit testing, 7) integration testing, 8) system testing, and 9) acceptance testing.

Figure 1. V-model development procedure framework


In the requirements analysis stage, the basics of system development are established. Because this
stage is paired with the acceptance testing stage, the acceptance test is designed here using the
usability aspect of the International Organization for Standardization (ISO) 25010 standard. After
the system concept is defined, the system design stage details all system components, including
product descriptions, use case diagrams, scenarios, activity diagrams, and the selection of PoseNet
as the deep learning model. At this stage a system test is also designed using all aspects of ISO
25010 except usability and functional suitability, because usability is used for acceptance testing
and functional suitability is used for integration testing. The stage after system design is
architectural design, which details system components by producing entity relationship diagrams,
class diagrams, sequence diagrams, and user interface (UI) mockups, and by selecting the MobileNet
architecture for PoseNet. The next stage is to design the specific modules used in the system.
After the entire design is produced, it is implemented in program code to become an information
system. The resulting product is then tested in sequence: unit testing (module tests), then
integration testing covering functional suitability and the deep learning model. The final steps
are system testing with the remaining aspects of ISO 25010, and acceptance testing together with
the accuracy test of emotion detection.
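
The allocation of ISO 25010 quality aspects to test stages described above can be written down as a small lookup table. The following is only an illustrative sketch: the characteristic names are the eight product quality characteristics of ISO/IEC 25010:2011, while the variable names and the pairing of architectural design with integration testing and of module design with unit testing follow the conventional V-model layout and are assumptions rather than details stated in this paper.

```python
# The eight product quality characteristics of ISO/IEC 25010:2011.
ISO_25010 = [
    "functional suitability", "performance efficiency", "compatibility",
    "usability", "reliability", "security", "maintainability", "portability",
]

# Allocation described in the text: usability is reserved for acceptance
# testing and functional suitability for integration testing.
TEST_PLAN = {
    "acceptance testing": ["usability"],
    "integration testing": ["functional suitability"],
}

# Every remaining characteristic is covered by system testing.
allocated = {c for aspects in TEST_PLAN.values() for c in aspects}
TEST_PLAN["system testing"] = [c for c in ISO_25010 if c not in allocated]

# Conventional V-model pairing of design stages with test stages
# (assumed layout; only the first two pairs are stated explicitly in the text).
V_MODEL_PAIRS = {
    "requirements analysis": "acceptance testing",
    "system design": "system testing",
    "architectural design": "integration testing",
    "module design": "unit testing",
}

for design_stage, test_stage in V_MODEL_PAIRS.items():
    aspects = TEST_PLAN.get(test_stage, [])
    print(f"{design_stage} -> {test_stage}: {', '.join(aspects) or 'module tests'}")
```

Deriving the system-testing list from the other two entries keeps each characteristic assigned to exactly one test stage.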


3. RESULTS AND DISCUSSION 

The product produced in this research is a system based on deep learning. Product testing covers
the quality of the information system, using the ISO 25010 standard, and the quality of emotion
detection. The confusion matrix and a comparison of the detection results with the results of the
AEQ instrument are used to determine the accuracy of emotion detection. The quality of the
information system shows good results based on its conformity with the standards used. This
information system provides emotion detection data as a basis for conducting affective assessments.
The system produces data on five student emotion classes: happy, neutral, no emotion, bored, and
disappointed. Detection is carried out during learning using each student’s webcam.

Recognition of students’ emotions from gestures can be seen in Figure 2. The emotion classes in
this system are encoded with letters: A for happy, shown in Figure 2(a); B for neutral; C for no
emotion shown; D for bored, shown in Figure 2(b); and E for disappointed. Using PoseNet allows the
system to detect skeletal joints as the basis for determining emotion classes. PoseNet detects 17
skeletal joints from head to toe. However, this emotion detection system needs only 11 of them,
because the system is used while students take part in learning in a sitting position, and a webcam
in these conditions can only record from the head to the stomach. The skeletal joint points used
are the eyes, ears, nose, shoulders, elbows, and wrists. This combination of skeletal joints can
show the variation of emotions being experienced by the person being detected [26].
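
The reduction from 17 to 11 joints can be illustrated with a short sketch. The keypoint names below are the ones PoseNet emits. The input layout (a list of dictionaries with `part`, `score`, and `position`), the `min_score` threshold, and the zero-filling of low-confidence joints are assumptions made for illustration and do not describe this system's actual preprocessing.

```python
# The 17 keypoint names produced by PoseNet, from head to toe.
POSENET_KEYPOINTS = [
    "nose", "leftEye", "rightEye", "leftEar", "rightEar",
    "leftShoulder", "rightShoulder", "leftElbow", "rightElbow",
    "leftWrist", "rightWrist", "leftHip", "rightHip",
    "leftKnee", "rightKnee", "leftAnkle", "rightAnkle",
]

# Joints that a webcam cannot see when the student is seated.
LOWER_BODY = {"leftHip", "rightHip", "leftKnee", "rightKnee",
              "leftAnkle", "rightAnkle"}

# The 11 joints kept: nose, eyes, ears, shoulders, elbows, wrists.
UPPER_BODY = [name for name in POSENET_KEYPOINTS if name not in LOWER_BODY]


def upper_body_features(keypoints, min_score=0.5):
    """Flatten the 11 upper-body joints into [x1, y1, ..., x11, y11].

    `keypoints` is assumed to be a list of dicts shaped like
    {"part": str, "score": float, "position": {"x": float, "y": float}}.
    Joints that are missing or below `min_score` are zero-filled
    (a hypothetical choice for this sketch).
    """
    by_part = {kp["part"]: kp for kp in keypoints}
    features = []
    for name in UPPER_BODY:
        kp = by_part.get(name)
        if kp is None or kp["score"] < min_score:
            features.extend([0.0, 0.0])
        else:
            features.extend([kp["position"]["x"], kp["position"]["y"]])
    return features
```

A fixed-length vector of 22 values per frame is one convenient input shape for a downstream emotion classifier, whatever model is used for that step.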

Figure 2. Comparing emotion and gesture recognition in (a) happy and (b) bored emotion


This technique allows affective assessment to be carried out by assessing the tendency of students’
emotion detection data collected during learning. The order of the affective assessment process
using this system can be seen in Figure 3. The assessment activity begins with the teacher opening
the class and students registering their presence in the class they are enrolled in. After students
receive confirmation that they are present in the class, emotion detection starts running. During
the learning process, students’ body gestures are recognized according to the predetermined emotion
classes. When learning has been completed, the teacher needs to close the class so that emotion
detection stops and all student data is saved. The detection results are then stored in the system
to obtain trend information.


[Flowchart: teacher opens the class -> students register presence in the enrolled class -> emotion
detection running -> class done? -> teacher closes the class -> data saved to system -> teacher
analyzes each student's data]

Figure 3. Affective assessment process
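
The process in Figure 3 can be sketched as a small session object. This is a minimal illustration under stated assumptions, not the implementation of the developed system: the class name `ClassSession`, its methods, and the in-memory storage are all hypothetical, and only the letter codes A to E come from the text.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Letter codes for the five emotion classes used by the system.
EMOTION_CLASSES = {
    "A": "happy", "B": "neutral", "C": "no emotion",
    "D": "bored", "E": "disappointed",
}


@dataclass
class ClassSession:
    """Hypothetical model of one class meeting, following Figure 3."""
    is_open: bool = False
    present: set = field(default_factory=set)
    detections: dict = field(default_factory=lambda: defaultdict(list))

    def open(self):
        # The teacher opens the class.
        self.is_open = True

    def register_presence(self, student_id):
        # Students can only register in a class that is open.
        if not self.is_open:
            raise RuntimeError("class is not open")
        self.present.add(student_id)

    def record(self, student_id, emotion_code):
        # Detection only counts for present students while the class is open.
        if (self.is_open and student_id in self.present
                and emotion_code in EMOTION_CLASSES):
            self.detections[student_id].append(emotion_code)
            return True
        return False

    def close(self):
        # The teacher closes the class; detection stops and data is returned
        # so the caller can save it.
        self.is_open = False
        return {s: list(codes) for s, codes in self.detections.items()}
```

A short usage example with made-up student identifiers:

```python
session = ClassSession()
session.open()
session.register_presence("student-01")
session.record("student-01", "A")
session.record("student-01", "D")
saved = session.close()  # {"student-01": ["A", "D"]}
```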


Affective assessment can be done by following the steps above. The emotional tendencies obtained
from emotion detection become the material for students’ affective assessment. The detection
results are displayed in graphs as in Figure 4. An individual student’s emotion detection results
and their assessment are shown in Figure 4(a), on the left. A radar chart is used to indicate which
emotion class is more dominant. The emotion detection data of all students in a classroom can also
be used to evaluate teaching style. The dominant emotion class in the classroom is shown in Figure
4(b), on the right.

Figure 4. Comparing the presentation of emotion detection data in (a) a student and (b) a classroom view
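
The tendency data behind the two views in Figure 4 can be approximated as the share of each emotion class among the recorded detections. The sketch below assumes that detections are stored as lists of class labels per student; the function names and the sample data are hypothetical and serve only to show the idea.

```python
from collections import Counter

EMOTIONS = ["happy", "neutral", "no emotion", "bored", "disappointed"]


def emotion_profile(labels):
    """Share of each emotion class in a list of labels (radar chart values)."""
    counts = Counter(labels)
    total = sum(counts[e] for e in EMOTIONS)
    if total == 0:
        return {e: 0.0 for e in EMOTIONS}
    return {e: counts[e] / total for e in EMOTIONS}


def dominant_emotion(profile):
    """Emotion class with the largest share (first one wins on ties)."""
    return max(EMOTIONS, key=lambda e: profile[e])


def classroom_profile(per_student):
    """Pool every student's detections into one classroom-level profile."""
    pooled = [label for labels in per_student.values() for label in labels]
    return emotion_profile(pooled)


# Made-up detections for two students in one class meeting.
per_student = {
    "student-01": ["happy", "happy", "neutral", "bored"],
    "student-02": ["bored", "bored", "neutral", "disappointed"],
}
student_view = emotion_profile(per_student["student-01"])
class_view = classroom_profile(per_student)
print(dominant_emotion(student_view), dominant_emotion(class_view))
```

Pooling all detections weights students by how many detections they have; averaging per-student profiles instead would weight every student equally, and either choice may be reasonable depending on how the classroom view is meant to be read.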


Given the detection results above, the quality of emotion detection needs to be tested. The test is
carried out in two ways: a confusion matrix, and an accuracy test using the AEQ instrument as the
actual class. The confusion matrix test can be seen in Figure 5, which shows the accuracy of
emotion detection for each predetermined emotion class. This test uses 250 detection samples, 50
for each emotion class. The system correctly detected 42 happy, 50 neutral, 41 no emotion, 34
bored, and 44 disappointed samples.

Figure 5. Confusion matrix


Based on the confusion matrix above, the detection accuracy can be calculated. The values needed
for the calculation are the number of true detections and the number of false detections. False
detections are divided into two kinds, namely false positives (FP) and false negatives (FN). Of the
250 test samples, the happy emotion was detected correctly in 42 of its 50 samples. The system also
correctly detected 50 out of 50 neutral poses, 41 out of 50 no-emotion poses, 34 out of 50 bored
poses, and 44 out of 50 disappointed poses. Accuracy is calculated by dividing the sum of all
correct detections by the total number of samples. Table 1 shows the calculation of the system’s
detection accuracy.

Table 1. Calculation of the accuracy value

Category of emotion   Happy   Neutral   No emotion   Bored   Disappointed   Total
True detection (T)    42      50        41           34      44             211

Accuracy = T/N = 211/250 = 0.844 = 84.4% (N = 250)
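
The arithmetic in Table 1 can be reproduced in a few lines. The counts are taken from the table; the per-class rate is an additional illustration that follows directly from the 50 samples per class and is not a figure reported in this paper.

```python
# True detections per emotion class, from Table 1.
true_detections = {
    "happy": 42, "neutral": 50, "no emotion": 41,
    "bored": 34, "disappointed": 44,
}
SAMPLES_PER_CLASS = 50  # each class contributes 50 of the 250 test samples

total_true = sum(true_detections.values())
total_samples = SAMPLES_PER_CLASS * len(true_detections)
accuracy = total_true / total_samples

# Fraction of each class's 50 samples that was detected correctly.
per_class_rate = {
    emotion: count / SAMPLES_PER_CLASS
    for emotion, count in true_detections.items()
}

print(f"Accuracy = {total_true}/{total_samples} = {accuracy:.1%}")
for emotion, rate in per_class_rate.items():
    print(f"{emotion}: {rate:.0%}")
```

The per-class view makes it visible that the bored class, with 34 of 50 samples detected correctly, lowers the overall figure the most.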


Based on the table, the number of correct detections is 211 and the total number of samples is 250,
so the detection accuracy is 84.4%. Beyond this accuracy result, further testing is needed to
compare the system’s detection results with the actual emotional condition of students. To confirm
the actual emotional state of the students, the AEQ questionnaire was administered. This accuracy
test divides the emotion categories into positive and negative. Figure 6 shows the difference
between the positive and negative emotion values obtained by the system and by the AEQ.

Figure 6. Results on positive and negative emotions by system and AEQ questionnaire


Figure 6 shows the difference between the detection results of the system and those of the AEQ.
This difference indicates the level of accuracy in detecting students’ emotions. Positive emotions
include interest and neutral, which are classified from students’ poses, while negative emotions
include bored and disappointed. Based on the calculation, the tests conducted on 34 students showed
a smaller, and therefore better, average difference of 17.17 in the positive emotion class than in
the negative emotion class, where it is 20.93. The value is calculated as the difference between
the actual condition (AEQ) and the detection results of the system. The system’s ability to detect
positive or negative emotions can be seen from the differences shown in Figure 7. They indicate
that the detection of positive emotions needs to be improved, as the actual condition is higher
than the detected one. On the other hand, the fluctuating values for the negative emotion class
indicate that detection is too sensitive to changes in body pattern.
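
The two class-level averages can be combined into a single agreement figure. The sketch below assumes that the overall difference is the unweighted mean of the positive and negative averages and that accuracy is its complement to 100%; this is an assumption about the calculation, although it is consistent with the 19.05% and 80.95% values reported in this study.

```python
# Average difference between AEQ results and system detections, per class.
avg_difference = {"positive": 17.17, "negative": 20.93}

# Assumed aggregation: unweighted mean of the two class averages.
overall_difference = round(sum(avg_difference.values()) / len(avg_difference), 2)

# Assumed definition: agreement with the AEQ is the complement of the difference.
agreement = round(100.0 - overall_difference, 2)

print(f"Average difference: {overall_difference}%")
print(f"Agreement with AEQ: {agreement}%")
```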


Figure 7. Differences detection results on emotions by system and AEQ questionnaire

Averaging the differences of the two emotion types gives 19.05%, which means that the actual system
accuracy is 80.95%. In addition to accuracy, the system was also tested for user acceptance. The
test was carried out using the USE instrument to measure the usability aspect of the product being
developed. The usability aspects are usefulness, ease of use, ease of learning, and satisfaction.
Usability testing by the teacher gave an average final score of 77.56%, consisting of usefulness
74.17%, ease of use 80%, ease of learning 85%, and satisfaction 77.33%. According to the teacher
and based on the test results, the system is easy to learn. Overall, the usability score of 77.56%
means that the developed system is quite usable in practice. The teacher assessed the system on all
four aspects, while the students evaluated only ease of use, ease of learning, and satisfaction.
According to the students, the developed system scored an average of 79.58% in the testing. Student
satisfaction with the system is 83.38%, which is quite high, while ease of use and ease of learning
are 79.89% and 78.82%, respectively. According to the results of usability testing by students, the
system is not only quite easy to use and learn but also provides satisfaction. Figure 8 shows a
diagram of the average assessment in usability testing.

Figure 8. Average assessment by teacher and students


4. CONCLUSION 

This study aimed to develop a product that can assist the implementation of affective assessment in
schools. Based on the research that has been done, it can be concluded that this affective
computing assessment system: i) is able to detect emotions through students’ body gestures; ii) is
able to provide emotion detection data as a basis for affective assessment; iii) detects emotions
with an accuracy of 84.4%; iv) achieves an actual emotion detection accuracy of 80.95%; v) obtained
a teacher acceptance score of 77.56% on the usability aspect; and vi) obtained a student acceptance
score of 79.58% on the usability aspect. Based on these conclusions, the results of this research
can be used in schools. In addition, this research can still be developed further. Future research
is expected to improve the quality of emotion detection by improving the quality of the dataset and
the model used.